Comparison of clustering methods: A case study of text-independent speaker modeling

Authors

  • Tomi Kinnunen
  • Ilja Sidoroff
  • Marko Tuononen
  • Pasi Fränti
Abstract

Clustering is needed in various applications such as biometric person authentication, speech coding and recognition, image compression and information retrieval. Hundreds of clustering methods have been proposed for the task in various fields but, surprisingly, there are few extensive studies actually comparing them. An important question is how much the choice of a clustering method matters for the final pattern recognition application. Our goal is to provide a thorough experimental comparison of clustering methods for text-independent speaker verification. We consider the parametric Gaussian mixture model (GMM) and the non-parametric vector quantization (VQ) model using the best-known clustering algorithms, including iterative (K-means, random swap, expectation-maximization), hierarchical (pairwise nearest neighbor, split, split-and-merge), evolutionary (genetic algorithm), neural (self-organizing map) and fuzzy (fuzzy C-means) approaches. We study recognition accuracy, processing time, clustering validity, and the correlation between clustering quality and recognition accuracy. These complementary experiments indicate that clustering is not a critical task in speaker recognition, and that the choice of algorithm should be based on computational complexity and simplicity of implementation. There are three main reasons for this: the data is not clustered, large models are used, and only the best algorithms are considered. For low-order models, however, the choice of algorithm can have a significant effect.
Index Terms: Clustering methods, speaker recognition, vector quantization, Gaussian mixture model, universal background model

List of abbreviations

  • ANN: Artificial neural network
  • DET: Detection error trade-off
  • EER: Equal error rate
  • EM: Expectation maximization
  • FAR: False acceptance rate
  • FRR: False rejection rate
  • FCM: Fuzzy C-means
  • GMM: Gaussian mixture model
  • GA: Genetic algorithm
  • MAP: Maximum a posteriori
  • MFCC: Mel-frequency cepstral coefficient
  • PNN: Pairwise nearest neighbor
  • RS: Random swap
  • SOM: Self-organizing map
  • SM: Split-and-merge
  • SVM: Support vector machine
  • UBM: Universal background model
  • VQ: Vector quantization
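As a rough illustration of the VQ speaker model named in the abstract, the sketch below trains a codebook with plain K-means and scores test vectors by their average quantization distortion. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `kmeans_codebook` and `avg_distortion`, the codebook size, and the synthetic 12-dimensional Gaussian "features" standing in for MFCC vectors are all hypothetical.

```python
import numpy as np


def kmeans_codebook(features, k, iters=20, seed=0):
    """Train a VQ codebook with plain K-means (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Initialize the codebook with k distinct training vectors.
    idx = rng.choice(len(features), size=k, replace=False)
    codebook = features[idx].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every vector to every code vector.
        dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                codebook[j] = members.mean(axis=0)
    return codebook


def avg_distortion(features, codebook):
    """Mean squared distance from each vector to its nearest code vector."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.min(axis=1).mean()


# Synthetic stand-ins for two speakers' feature vectors (assumption: real
# systems would use MFCC features extracted from speech).
rng = np.random.default_rng(1)
speaker_a = rng.normal(loc=0.0, scale=1.0, size=(500, 12))
speaker_b = rng.normal(loc=3.0, scale=1.0, size=(500, 12))

# Train a codebook on speaker A, then score held-out data from both speakers.
codebook_a = kmeans_codebook(speaker_a[:400], k=8)
d_same = avg_distortion(speaker_a[400:], codebook_a)
d_other = avg_distortion(speaker_b[400:], codebook_a)
print(f"distortion, same speaker:  {d_same:.2f}")
print(f"distortion, other speaker: {d_other:.2f}")
```

A lower distortion indicates a better match to the speaker's codebook. Any of the other clustering algorithms listed in the abstract could replace `kmeans_codebook` here while the scoring stays the same, which is the kind of substitution the comparison is about.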

Similar resources

Comparison of Clustering Algorithms for Speaker Identification

In this paper we consider the problem of text-independent speaker identification, which belongs to acoustic recognition research. Many different techniques have been presented over the past several decades. A state-of-the-art technique uses Gaussian mixture models (GMM) for modeling the speaker data distribution represented by MFCC [1] or LPCC [2] features. The classification is obtained by choosing the speaker cl...

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...

Optimizing Spectral Feature Based Text-Independent Speaker Recognition

Automatic speaker recognition has been an active research area for more than 30 years, and the technology has gradually matured to a state ready for real applications. In the early years, text-dependent recognition was studied more, but the focus has gradually moved towards text-independent recognition because its application field is much wider, including forensics, teleconferencing, and user ...

A Chinese phoneme clustering theory and its application to a text independent speaker verification system

This paper presents a new idea of Chinese phoneme clustering and a text-independent speaker verification system that applies this technique. Unlike conventional verification methods, which use averaged features, the new method includes both the dynamic and static features of speech. It also leads to a fast and efficient clustering algorithm in the training phase. The...

Text-independent Speaker Recognition by Trajectory Space Comparison

We present the principle of trajectory space comparison for text-independent speaker recognition and some solutions to the space comparison problem based on vector quantization. A comparison of the recognition rates of the different solutions is reported. The experimental system achieved a 99.5% text-independent speaker recognition rate for 23 speakers, using 5 phrases for training and 5 for testing. A speaker-...

Journal:
  • Pattern Recognition Letters

Volume 32

Publication date: 2011